From Gene Trees to Species Trees

Authors

  • Bin Ma
  • Ming Li
  • Louxin Zhang
Abstract

This paper studies various algorithmic issues in reconstructing a species tree from gene trees under the duplication and the mutation cost model. This is a fundamental problem in computational molecular biology. Our main results are as follows.

1. A linear time algorithm is presented for computing all the losses in duplications associated with the least common ancestor mapping from a gene tree to a species tree. This answers a problem raised recently by Eulenstein et al. (1998).

2. The complexity of finding an optimal species tree from gene trees is studied. The problem is proved to be NP-hard for the duplication cost and for the mutation cost. Further, the concept of reconciled trees was introduced by Goodman et al. and formalized by Page for visualizing the relationship between gene and species trees. We show that constructing an optimal reconciled tree for gene trees is also NP-hard. Finally, we consider a general reconstruction problem and show it to be NP-hard even for the well-known nearest neighbor interchange distance.

3. A new and efficiently computable metric is defined based on the duplication cost. We show that the problem of finding an optimal species tree from gene trees is NP-hard under this new metric but it can be approximated within factor 2 in polynomial time. Using this approximation result, we propose a heuristic method for finding a species tree from gene trees with uniquely labeled leaves under the duplication cost. Our experimental tests demonstrate that, when the number of species is larger than 15 and gene trees are close to each other, our heuristic method is significantly better than the existing program in Page's GeneTree 1.0 that starts the search from a random tree.
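The least common ancestor (LCA) mapping and the duplication cost named in the abstract can be illustrated with a small sketch. This is not the paper's linear time algorithm: it is a simple quadratic version, it counts duplications only (not losses), and it assumes binary trees written as nested tuples whose leaves are species names. The function names `clusters`, `lca_map`, and `duplication_cost` are hypothetical and chosen only for this illustration.

```python
def clusters(tree):
    """Return the leaf set of every node of a nested-tuple binary tree (post-order)."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    left, right = tree
    lc, rc = clusters(left), clusters(right)
    # The last entry of each list is the leaf set of that subtree's root.
    return lc + rc + [lc[-1] | rc[-1]]


def lca_map(leaves, species_clusters):
    """Map a set of species to its LCA in the species tree.

    The LCA is identified by its leaf set: the smallest species-tree
    cluster that contains every species in `leaves`.
    """
    return min((c for c in species_clusters if leaves <= c), key=len)


def duplication_cost(gene_tree, species_tree):
    """Count gene-tree nodes that map to the same species node as one of their children."""
    species_clusters = clusters(species_tree)

    def visit(g):
        # Returns (species cluster that g maps to, duplications in g's subtree).
        if isinstance(g, str):
            return lca_map(frozenset([g]), species_clusters), 0
        (m_left, d_left), (m_right, d_right) = visit(g[0]), visit(g[1])
        m = lca_map(m_left | m_right, species_clusters)
        is_duplication = m == m_left or m == m_right
        return m, d_left + d_right + (1 if is_duplication else 0)

    return visit(gene_tree)[1]


if __name__ == "__main__":
    species = (("a", "b"), "c")
    print(duplication_cost((("a", "b"), "c"), species))  # gene tree agrees with species tree
    print(duplication_cost((("a", "c"), "b"), species))  # gene tree conflicts with species tree
```

In this sketch a gene tree that agrees with the species tree needs no duplications, while a conflicting topology forces at least one gene-tree node onto the same species node as its child, which is what the duplication cost counts.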


Similar resources

Quantitative Comparison of Tree Pairs Resulted from Gene and Protein Phylogenetic Trees for Sulfite Reductase Flavoprotein Alpha-Component and 5S rRNA and Taxonomic Trees in Selected Bacterial Species

Introduction: FAD is the cofactor of FAD-FR protein family. Sulfite reductase flavoprotein alpha-component is one of the main enzymes of this family. Based on applications of this enzyme in biotechnology and industry, it was chosen as the subject of evolutionary studies in 19 specific species. Method: Gene and protein sequences of sulfite reductase flavoprotein alpha-component, 5S rRNA sequence...


Study on phylogenetic status of Hari barbel Luciobarbus conocephalus (Kessler, 1872) from Hari river using Cytb gene

Recently, Luciobarbus conocephalus from the Hari River was reported for the first time, but authors disagree about the validity of this species, because some of them treat it as a subspecies or synonym of L. capito. Therefore, the present study was conducted to investigate the phylogenetic status and the validity of this species. For this purpose, specimens captured from Hari Riv...


Study of ectomycorrhizal fungi with beech trees in highland beech forests (Farim, Mazandaran province)

In this study, the ectomycorrhizal fungi from beech trees in highland beech forests of Farim (Mazandaran province) were identified based on extraction of DNA from roots and sequencing the ITS region of nuclear ribosomal DNA. For this purpose, at altitudes of 1500-2100 meters A.S.L., 30 plots and one plant per plot were selected randomly and samples were taken from roots in depths of 10 cm ...


Estimating Height and Diameter Growth of Some Street Trees in Urban Green Spaces

Estimating urban tree growth, especially tree height, is very important in urban landscape management. The aim of the study was to predict tree height based on tree diameter. To achieve this goal, 921 trees from five species were measured in five areas of Mashhad city in 2014. The evaluated trees were ash tree (Fraxinus species), plane tree (Platanus hybrida), white mulberry (Morus alba), ail...


Isolation and molecular characterization of Cryptococcus species isolated from pigeon nests and Eucalyptus trees

Background and Purpose: Cryptococcus species are pathogenic and non-pathogenic basidiomycete yeasts that are found widely in the environment. Based on phenotypic methods, this genus has many species; however, its taxonomy is presently being re-evaluated by modern techniques. The Cryptococcus species complex includes two sibling taxa of Cryptococcus neoformans and Cryptococcus gattii. We aimed ...



Journal:
  • SIAM J. Comput.

Volume 30

Publication date: 2000